
Supervised and Self-Supervised Pretraining Based Covid-19 Detection Using Acoustic Breathing/Cough/Speech Signals

Chen, X. Y.; Zhu, Q. S.; Zhang, J.; Dai, L. R.

2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 561-565, 2022.

Article in English | Web of Science | ID: covidwho-2191814

ABSTRACT

A rapid and accurate detection method for COVID-19 is important for containing the pandemic. In this work, we propose a COVID-19 detection method based on a bi-directional long short-term memory (BiLSTM) network that uses breath, speech, and cough signals. The three kinds of acoustic signals are used to train the network, and an individual model is built for each of the three tasks. The parameters of these models are averaged to obtain an average model, which is then used as the initialization for the BiLSTM model training of each task. We show that this initialization method significantly improves detection performance on all three tasks. We call this supervised pre-training based detection. In addition, we take an existing pre-trained wav2vec2.0 model and further pre-train it on the DiCOVA dataset; the resulting model is used to extract a high-level representation that replaces conventional mel-frequency cepstral coefficient (MFCC) features as the model input. We call this self-supervised pre-training based detection. To reduce the information redundancy in the recorded sounds, silent segment removal, amplitude normalization, and time-frequency masking are also considered. The proposed detection model is evaluated on the DiCOVA dataset, and the results show that our method achieves an area under the curve (AUC) score of 88.44% on the blind test in the fusion track. Using high-level features together with MFCC features is shown to improve diagnostic accuracy.
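
The parameter-averaging initialization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `average_parameters`, the dictionary-of-lists parameter layout, and the toy values are all assumptions made for illustration, and a real BiLSTM would hold tensors rather than short lists.

```python
from typing import Dict, List

# Assumed layout: parameter name -> flat list of weights (hypothetical)
Params = Dict[str, List[float]]


def average_parameters(models: List[Params]) -> Params:
    """Element-wise mean of parameters across models with identical shapes."""
    if not models:
        raise ValueError("need at least one model")
    names = models[0].keys()
    for m in models[1:]:
        if m.keys() != names:
            raise ValueError("all models must share the same parameter names")
    averaged: Params = {}
    for name in names:
        length = len(models[0][name])
        if any(len(m[name]) != length for m in models):
            raise ValueError(f"parameter {name!r} has mismatched sizes")
        # zip groups the i-th weight of every model together
        columns = zip(*(m[name] for m in models))
        averaged[name] = [sum(col) / len(models) for col in columns]
    return averaged


# Toy stand-ins for the three separately trained task models (made-up values)
breath = {"w": [0.0, 3.0], "b": [1.0]}
cough = {"w": [3.0, 3.0], "b": [2.0]}
speech = {"w": [6.0, 3.0], "b": [3.0]}

init = average_parameters([breath, cough, speech])

# Each task starts its own training run from an independent copy of the average
task_inits = {
    task: {name: list(values) for name, values in init.items()}
    for task in ("breath", "cough", "speech")
}
```

In this sketch each task receives its own copy of the averaged parameters, so further training on one task would not alter the starting point of the others.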
